Abstract: Physiological waveform signals collected from
unstructured  environments  are  noisy,  requiring  automated 
algorithms  to  assess  the  reliability  of  the  derived  vital  signs,  such 
as  heart  rate  (HR)  and  respiratory  rate  (RR),  before  they  can  be 
used  for  automated  decision  support.  We  recently  proposed  a 
weighted  regularized  least  squares  method  to  estimate 
instantaneous  HR  (HRr),  which  readily  provides  analytically 
based  confidence  intervals  (CIs).  Accordingly,  this  method  can 
be  extended  to  the  estimation  of  instantaneous  RR  (RRr).  In  this 
study,  we  aim  to  investigate  whether  we  can  use  CIs  to  select 
reliable  HRr  and  RRr.  We  calculated  HRr  and  RRr  for  532  and 
370  trauma  patients,  respectively,  grouped  the  rates  according 
to  their  CIs,  and  investigated  their  reliability  by  determining 
their  ability  to  diagnose  major  hemorrhage.  The  areas  under  a 
receiver  operating  characteristic  curve  of  HRr  and  RRr  with 
CI < 5 bpm (beats per minute for HR and breaths per minute
for  RR)  were  0.70  and  0.66,  respectively.  RRr  was  superior  to 
the  average  output  of  the  clinical  monitor  (p  <  0.05  by  DeLong’s 
test),  while  HRr  was  equivalent.  HRr  and  RRr  provide  a  new 
approach  to  systematically  and  automatically  assess  the 
reliability  of  noisy,  field-collected  vital  signs. 

I.  Introduction 

PHYSIOLOGICAL waveform signals collected from
trauma patients during transport from the scene of an
accident to a trauma center are usually severely
contaminated  with  noise.  Vital  signs  derived  from  such  noisy 
waveform  recordings  are  therefore  frequently  inaccurate, 
precluding  their  use  in  automated  decision-support 
algorithms.  To  address  this  challenge,  our  group  previously 
developed  physiological  data  qualification  algorithms  that 
automatically  assess  the  reliability  of  major  vital  signs,  such 
as  heart  rate  (HR)  and  respiratory  rate  (RR)  [1],  [2].  While 
these  algorithms  have  been  shown  to  match  the  assessments 
made  by  human  experts  and  significantly  improve  the 
accuracy  of  automated  decision-support  algorithms  [3],  they 
have some shortcomings: they are not designed for
computing instantaneous rates, require the availability of
redundant  sensor  measurements,  and  are  based  on  heuristic 
criteria. 

To  resolve  these  shortcomings,  we  recently  proposed  a 
robust  method  for  estimating  instantaneous  HR  from 
noise-laden  electrocardiogram  (ECG)  waveforms  with 
normal  sinus  rhythm  (i.e.,  no  arrhythmia)  [4].  This  method 
implements  a  weighted  regularized  least  squares  (WRLS) 
algorithm  for  accurate  HR  estimation  [regularized  HR  (HRr)] 
and,  importantly,  provides  a  systematic,  analytically  based 
approach  to  compute  confidence  intervals  (CIs),  which  reflect 
uncertainties in the estimated HRr. We have shown that the
CIs capture the noise level in ECG waveforms: large CIs
reflect high levels of ECG noise, and small CIs reflect low levels [4]. In
addition,  the  method  can  be  readily  extended  to  estimating 
instantaneous  RR  [regularized  RR  (RRr)]  from  respiratory 
waveforms.  In  this  study,  we  aim  to  investigate  whether  CIs 
can  be  used  to  select  reliable  HRr  and  RRr.  More 
specifically,  we  divided  the  CIs  into  three  non-overlapping 
ranges,  and  compared  the  extent  to  which  HRr  and  RRr  with 
smaller  CIs  (i.e.,  the  more  reliable  rates)  were  able  to  improve 
the  detection  of  major  hemorrhage  in  trauma  patients. 

II.  Methods 

A. Respiratory Rate and Confidence Interval Estimation

The estimation of HRr and its associated CI is described in
detail in [4]. Here, we summarize the estimation of RRr and
its CI in an analogous manner.

Because  the  low-frequency  respiratory  signal  is  subject  to 
movement  artifacts  and  erroneous  placement  of  sensor 
electrodes  on  the  body  [5],  [6],  the  corresponding  respiratory 
waveforms  are  usually  characterized  by  low  signal-to-noise 
ratios.  Therefore,  before  estimating  the  instantaneous  RR,  we 
first  denoised  the  respiratory  waveforms  with  a  smoothing 
algorithm  developed  by  our  group  [2].  Second,  we  detected 
the  local  maxima  in  the  denoised  respiratory  waveform  and 
formed a time series of the cumulative peak occurrence times
(P_i), 0 < P_1 < P_2 < ... < P_N < T, where N is the total number of
the cumulative peak occurrence times and T is the length of
the denoised respiratory waveform.
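As a rough illustration of the peak-detection step, the sketch below finds local maxima in an already denoised waveform and returns their occurrence times. The function name `peak_times`, the strict three-point comparison, and the synthetic sinusoid are assumptions made for illustration only; the smoothing algorithm of [2] is not reproduced here.

```python
import numpy as np

def peak_times(waveform, fs):
    """Return occurrence times (s) of strict local maxima in a denoised waveform."""
    x = np.asarray(waveform, dtype=float)
    # A sample counts as a peak if it exceeds both of its neighbours.
    is_peak = (x[1:-1] > x[:-2]) & (x[1:-1] > x[2:])
    idx = np.flatnonzero(is_peak) + 1   # +1 restores the index into x
    return idx / fs

# Synthetic example (assumption): a 0.25 Hz sinusoid, i.e., 15 breaths/min,
# sampled at 23 Hz for 20 s.
fs = 23.0
t = np.arange(460) / fs
resp = np.sin(2 * np.pi * 0.25 * t)
P = peak_times(resp, fs)   # cumulative peak occurrence times in seconds
```

On real recordings a three-point comparison would be sensitive to residual noise, which is why the denoising step precedes peak detection.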

Third, we formulated the cumulative peak occurrence time
P as an integration of the peak-to-peak interval (PPI),

$$\mathbf{P} = \mathbf{A} \cdot \mathbf{PPI} + \boldsymbol{\varepsilon}, \tag{1}$$

where P denotes an N × 1 vector of measured cumulative
peak occurrence times (in seconds), A denotes an N × N
lower triangular integration matrix with all non-zero elements
equal to one, and ε represents an N × 1 vector of measurement
noise in P. The PPI was estimated as the solution to an
ordinary least squares (OLS) problem,

$$\mathbf{PPI}_{\mathrm{OLS}} = \left[ (\mathbf{A}^{T} \mathbf{A})^{-1} \mathbf{A}^{T} \right] \mathbf{P}, \tag{2}$$

where PPI_OLS represents an N × 1 vector of the estimated PPI
values. Consequently, the OLS solution of RR was RR_OLS =
60/PPI_OLS in breaths per minute (bpm).
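The OLS step can be sketched numerically as follows. This is a minimal illustration under stated assumptions: the function name `ppi_ols` and the example peak times are hypothetical, and `numpy.linalg.lstsq` is used in place of forming the normal equations explicitly.

```python
import numpy as np

def ppi_ols(P):
    """OLS estimate of peak-to-peak intervals (s) from cumulative peak times P (s)."""
    P = np.asarray(P, dtype=float)
    N = P.size
    A = np.tril(np.ones((N, N)))   # lower triangular integration matrix of ones
    # Least-squares solution of A @ ppi = P.
    ppi, *_ = np.linalg.lstsq(A, P, rcond=None)
    return ppi

# Hypothetical peak times (s), for illustration only.
P = np.array([1.0, 5.0, 9.0, 13.5, 17.0])
ppi = ppi_ols(P)
rr_ols = 60.0 / ppi   # breaths per minute
```

Because A is square and invertible, the OLS solution here reduces to successive differences of the peak times, with the first element equal to the time from the start of the record to the first peak.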

Next, by applying the WRLS algorithm as described in [4],
we calculated the regularized PPI (PPIr) as:

$$\mathbf{PPI}_{r} = \left( \mathbf{W}^{T} \mathbf{A}^{T} \mathbf{A} \mathbf{W} + \lambda^{2} \mathbf{L}^{T} \mathbf{L} \right)^{-1} \mathbf{W}^{T} \mathbf{A}^{T} \mathbf{A} \mathbf{W} \cdot \mathbf{PPI}_{\mathrm{OLS}}, \tag{3}$$

where W denotes a diagonal N × N weighting matrix whose
elements are either zeros (represented by 10^-5) for spike-like
outliers detected in PPI_OLS (and RR_OLS) via an impulse
rejection filter [7] or ones for non-outliers; L denotes a
smoothing matrix that constrains high-frequency noise
amplification in the PPI estimates and produces a smooth and
consistent solution; and λ represents a positive regularization
parameter that controls the tradeoff between the fit to the
data and the smoothness of the solution. A standard choice for
L (and the one used here) is an (N-2) × N matrix
representing a second-order derivative [8]. We customized λ
for each patient. In particular, starting with λ = 0 (i.e., no
regularization), we incrementally increased it until the
absolute time rate of change of the estimated RRs dropped
below a specified threshold of 8.0 bpm/s, which represents
the average absolute time rate of change of RRs estimated
from clean respiratory waveform segments in our trauma
patient database [9]. Accordingly, we calculated RRr as
RRr = 60/PPIr in bpm.
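A minimal numerical sketch of the regularized solution in Eq. (3) is given below. The function names, the example data, and the fixed value of the regularization parameter are assumptions for illustration; the impulse rejection filter of [7] and the patient-specific parameter search are not reproduced, so the outlier mask is simply passed in.

```python
import numpy as np

def second_difference(N):
    """(N-2) x N matrix approximating a second-order derivative."""
    L = np.zeros((N - 2, N))
    for i in range(N - 2):
        L[i, i:i + 3] = [1.0, -2.0, 1.0]
    return L

def ppi_wrls(ppi_ols, outlier_mask, lam):
    """Weighted regularized least squares estimate of the PPI series, per Eq. (3)."""
    ppi_ols = np.asarray(ppi_ols, dtype=float)
    N = ppi_ols.size
    A = np.tril(np.ones((N, N)))                      # integration matrix
    W = np.diag(np.where(outlier_mask, 1e-5, 1.0))    # near-zero weight for outliers
    L = second_difference(N)
    M = W.T @ A.T @ A @ W
    # Solve (M + lam^2 L^T L) x = M ppi_ols rather than forming an inverse.
    return np.linalg.solve(M + lam**2 * (L.T @ L), M @ ppi_ols)

# Hypothetical series with one spike-like outlier at index 3.
x = np.array([4.0, 4.0, 4.0, 0.5, 4.0, 4.0, 4.0])
mask = np.array([False, False, False, True, False, False, False])
ppi_r = ppi_wrls(x, mask, lam=1.0)
rr_r = 60.0 / ppi_r
```

In this example the down-weighted outlier is effectively replaced by a value consistent with its neighbours, because only the smoothness term constrains that element.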

Finally, we computed the CI for the estimated RRr through
a standard formulation [10]:

$$\mathrm{CI} = \mathrm{RR}_{r} \pm t_{\alpha/2} \cdot \sqrt{\mathrm{Var}(\mathrm{RR}_{r})}, \tag{4}$$

where t_{α/2} denotes a percentile of a Student’s t-distribution
with a significance level of α and Var(RRr) represents the
variance of RRr. The derivation of Var(RRr) was analogous
to the one described in [4]. Here we used α = 0.05 and t_0.025 =
1.96 for the 95% CI.
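Given a variance estimate, the interval in Eq. (4) is a one-line computation, sketched below. The variance derivation of [4] is not reproduced, so the variances are supplied as inputs, and the function name and example numbers are hypothetical.

```python
import numpy as np

def confidence_interval(rr, var_rr, t_crit=1.96):
    """Lower and upper 95% CI bounds for rate estimates with known variances."""
    rr = np.asarray(rr, dtype=float)
    half_width = t_crit * np.sqrt(np.asarray(var_rr, dtype=float))
    return rr - half_width, rr + half_width

# Hypothetical rate estimates (bpm) and variances (bpm^2).
lower, upper = confidence_interval([15.0, 18.0], [1.0, 4.0])
```

Whether a threshold such as 5 bpm applies to the half-width or to the full width of the interval is not stated in the text, so any downstream grouping should make that choice explicit.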

B.  Study  Data 

In  this  study  we  used  both  discrete  attribute  data  and 
physiological  time-series  data  collected  from  898  trauma 
casualties  during  and  after  transport  by  helicopter  service 
from  the  scene  of  injury  to  the  Level-I  unit  at  the  Memorial 
Hermann  Hospital  in  Houston,  TX  [9].  The  time-series 
variables  were  measured  by  Propaq  206EL  vital-sign 
monitors  (Welch  Allyn;  Skaneateles  Falls,  NY),  downloaded 
to  an  attached  personal  digital  assistant,  and  ultimately  stored 
in  our  database.  The  physiological  data  include  ECG 
waveforms  (sampled  at  182  Hz),  respiratory  waveforms 
(sampled  at  23  Hz),  their  corresponding  monitor-computed 
HR  and  RR  (recorded  at  1-s  intervals),  and  other  vital-sign 
data  described  elsewhere  [11].  Patient  attribute  data,  such  as 
demographics, injury description, and treatments, were also
collected via chart review. Data were collected and
retrospectively  analyzed  with  the  approval  of  the  local  and  the 
U.S.  Army’s  human  subjects  Institutional  Review  Board,  Fort 
Detrick,  MD. 

C.  Outcome:  Major  Hemorrhage  vs.  Control 

Our analyses required that we distinguish major
hemorrhage patients from controls. Accordingly, patients
with major hemorrhage were defined as those who received
one or more units of packed red blood cell transfusion within
24 h of arrival at the hospital and had a documented injury
that  was  explicitly  hemorrhagic,  which  was  one  or  more  of 
the  following:  (a)  laceration  or  fracture  of  a  solid  organ,  (b) 
thoracic  or  abdominal  hematomas,  (c)  explicit  vascular  injury 
that  required  operative  repair,  or  (d)  limb  amputation. 
Patients  who  received  blood  but  did  not  meet  the  documented 
injury  criteria,  i.e.,  ambiguous  hemorrhagic  patients,  and 
patients  who  died  before  arrival  at  the  hospital  were  excluded 
from  the  analysis.  The  remaining  patients  were  labeled  as 
controls. 
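The labeling rules above can be written as a small decision function. This is only a sketch: the function and argument names are hypothetical, and the documented-injury criteria (a) through (d) are collapsed into a single boolean that would come from chart review.

```python
def label_patient(prbc_units_24h, hemorrhagic_injury, died_before_arrival):
    """Assign an outcome label following the study definitions.

    prbc_units_24h: packed red blood cell units received within 24 h of arrival
    hemorrhagic_injury: True if any documented injury criterion (a)-(d) is met
    died_before_arrival: True if the patient died before reaching the hospital
    """
    if died_before_arrival:
        return "excluded"
    if prbc_units_24h >= 1:
        # Transfused patients without a documented hemorrhagic injury are ambiguous.
        return "major_hemorrhage" if hemorrhagic_injury else "excluded"
    return "control"
```

Note that a patient with a documented hemorrhagic injury but no transfusion is labeled a control under these rules.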

D.  Data  Analysis 

We  investigated  the  ability  of  CIs  to  select  reliable  HRr 
and  RRr  by  determining  the  extent  to  which  HRr  and  RRr 
with  different  CIs  could  distinguish  major  hemorrhage 
patients  from  controls.  Thus,  we  divided  the  CIs  into  three 
non-overlapping ranges, CI < 5, 5 < CI < 20, and CI > 20 bpm
(beats  per  minute  for  HR  and  breaths  per  minute  for  RR),  and 
selected  different  study  populations  for  HR  and  RR  so  that 
each  patient  in  each  study  population  had  at  least  one  HRr 
value  (or  RRr  value,  for  the  corresponding  study  population) 
in  each  of  the  three  ranges. 
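The grouping and inclusion rule can be sketched as below. The function name and the example values are hypothetical, the range boundaries are taken as strict inequalities exactly as printed above, and the CI size passed in is whatever scalar measure of the interval the analysis uses.

```python
import numpy as np

def mean_rate_by_ci_range(rates, ci_sizes):
    """Average a patient's rates within each CI range.

    Returns None when the patient has no value in at least one range,
    i.e., when the patient would not be part of the study population.
    """
    rates = np.asarray(rates, dtype=float)
    ci = np.asarray(ci_sizes, dtype=float)
    masks = {
        "CI < 5": ci < 5,
        "5 < CI < 20": (ci > 5) & (ci < 20),
        "CI > 20": ci > 20,
    }
    if not all(m.any() for m in masks.values()):
        return None
    return {name: float(rates[m].mean()) for name, m in masks.items()}

# Hypothetical instantaneous rates (bpm) and CI sizes (bpm) for one patient.
res = mean_rate_by_ci_range([80, 82, 90, 100, 120, 60], [2, 3, 10, 15, 25, 30])
```

Requiring every patient to contribute to all three ranges keeps the comparison across ranges paired on the same patients.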

We  evaluated  the  diagnostic  performance  of  the  averaged 
HRr  and  RRr  in  each  range  by  performing  univariate  analysis 
to  distinguish  between  major  hemorrhage  and  control 
patients,  constructing  receiver  operating  characteristic  (ROC) 
curves,  and  calculating  the  areas  under  the  ROC  curves 
(AUCs).  We  computed  the  ROC  AUCs  using  DeLong’s 
method  [12].  For  comparison,  we  calculated  ROC  AUCs  for 
the monitor-computed Propaq HR (HRp) and RR (RRp),
which were averaged over all available data for each patient.
We also computed ROC AUCs for the HR and RR calculated
from our previously developed algorithms (HRc and RRc),
which were averaged over only reliable rates whose quality
index (QI) was > 2 [1], [2]. In the analyses of the Propaq and
QI-qualified rates, we used the same study populations as the
ones  used  to  assess  the  reliability  of  the  CIs. 

We applied Pearson’s chi-square test to compare the
population  demographics  between  the  total  population  and 
the  study  sub-populations  (except  for  the  comparison  of  mean 
ages,  where  we  used  the  Student’s  t-test),  and  DeLong’s 
test  to  compare  the  ROC  AUCs.  We  considered  a  p-value 
of  <  0.05  to  be  statistically  significant. 
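For readers unfamiliar with the AUC comparison, the sketch below computes nonparametric AUCs and a DeLong-type z-test for two markers measured on the same patients. It is an illustrative implementation under stated assumptions, not the code used in the study: the function names and example data are hypothetical, higher scores are assumed to indicate hemorrhage, and the two markers must not be identical (the variance of the difference would be zero).

```python
import math
import numpy as np

def _components(scores, labels):
    """Structural components: per-case and per-control placement values."""
    s = np.asarray(scores, dtype=float)
    y = np.asarray(labels, dtype=bool)
    pos, neg = s[y], s[~y]
    # psi[i, j] = 1 if case i scores above control j, 0.5 if tied, else 0.
    psi = (pos[:, None] > neg[None, :]) + 0.5 * (pos[:, None] == neg[None, :])
    return psi.mean(axis=1), psi.mean(axis=0)

def delong_test(scores_a, scores_b, labels):
    """Return (auc_a, auc_b, z, two-sided p) for two correlated ROC curves."""
    v10_a, v01_a = _components(scores_a, labels)
    v10_b, v01_b = _components(scores_b, labels)
    auc_a, auc_b = float(v10_a.mean()), float(v10_b.mean())
    m, n = v10_a.size, v01_a.size
    # 2 x 2 covariance matrices of the components across cases and controls.
    s10 = np.cov(np.vstack([v10_a, v10_b]))
    s01 = np.cov(np.vstack([v01_a, v01_b]))
    S = s10 / m + s01 / n
    var_diff = S[0, 0] + S[1, 1] - 2.0 * S[0, 1]
    z = (auc_a - auc_b) / math.sqrt(var_diff)
    p = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided normal p-value
    return auc_a, auc_b, z, p

# Hypothetical data: 4 hemorrhage patients (label 1) and 4 controls (label 0).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
marker_a = [9, 8, 7, 3, 6, 2, 1, 0]
marker_b = [5, 1, 6, 2, 4, 3, 7, 0]
auc_a, auc_b, z, p = delong_test(marker_a, marker_b, labels)
```

The covariance term accounts for the fact that both markers come from the same patients, which is what distinguishes this test from comparing two independent AUCs.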

III.  Results 

A.  Population  Statistics 

The HR study population consisted of 470 controls and 62
hemorrhage patients, and the RR study population consisted
of 323 controls and 47 hemorrhage patients. Table I shows the
summary statistics of the populations. Both the HR and RR
study populations had demographics similar to those of the
total population, except that both study populations had lower
mortality rates (p < 0.05). This is in accordance with our
prior finding that higher-acuity casualties tend to have noisier
data [2], [13]: such patients were not included because
they lacked HRr and RRr with small CIs.

B.  Examples  of  Calculated  Confidence  Intervals 

Figure 1A shows 80 seconds (s) of a normalized ECG
record for a typical patient in our trauma database, where the
ECG was contaminated with spike noise for the segment
between 505 and 550 s and relatively clean for the remaining
segments. Figure 1B shows the corresponding HRr in beats
per minute (bpm) derived with the WRLS algorithm, where
the vertical bars indicate the width of the computed CIs.
Here, we labeled as reliable those HRr with CI < 5 bpm,
which coincided with the clean ECG segments.

Similarly, Figure 2 (A and B) shows 80 s of a normalized
respiratory waveform record for a different patient in our
trauma database and the associated RRr and CI estimates. We
labeled as reliable those RRr with CI < 5 bpm, which
corresponded to the relatively clean respiratory waveforms
outside of the 560-580 s segment. Conversely, the RRr
within this noise-corrupted segment were characterized by
large CIs, indicating that they were unreliable.

C.  Confidence  Interval  Performance  Evaluation 

Table II summarizes the ROC AUCs of HRr and RRr for
the three CI ranges in the detection of major hemorrhage in
trauma patients. For comparison, it also includes the ROC
AUCs of HRp and RRp and of the reliable HRc and RRc. In
general, ROC AUCs of HRr and RRr increased with smaller
CIs. While the HRr result for CI < 5 bpm was no different
from that of HRp, the improvement of RRr with CI < 5 bpm
over RRp was statistically significant. The results of our
previously developed algorithms were consistently, but not
statistically significantly, better than those obtained with
regularized rates for CI < 5 bpm.

IV.  Discussion  and  Conclusions 

In this study, we explored the utility of statistically based
CIs to assess the reliability of field-collected HRs and RRs.
We evaluated the ability of the CIs to yield reliable rates by
using HRs and RRs with different CIs to diagnose major
hemorrhage in trauma patients, knowing that more reliable
HR and RR estimates would offer better diagnostic value [3].
Our major finding was that HRr and RRr computed from
smooth and clean waveforms (assessed by CI < 5 bpm) were
statistically significantly more diagnostic of major hemorrhage
than those from noisy or arrhythmic waveforms (assessed by
CI > 20 bpm). This suggests that the
regularized rates with smaller CIs were physiologically more
informative (i.e., more reliable) and provided superior clinical
information for trauma patients, where arrhythmia was
seldom observed.


TABLE I

DEMOGRAPHICS OF THE TOTAL AND STUDY POPULATIONS

Characteristics      Total Population   HR Study Population^a   RR Study Population^a
Population size      898                532                     370
Male                 660^b (73%)        394 (74%)               279 (75%)
Female               234 (26%)          137 (26%)               91 (25%)
Mean age, yr         38 (SD 15)         38 (SD 15)              38 (SD 15)
Blunt injury         778^c (87%)        476 (89%)               326 (88%)
Penetrating injury   101 (11%)          49 (9%)                 38 (10%)
Mortality            94 (10%)           34 (6%)                 22 (6%)
Major hemorrhage     97 (11%)           62 (12%)                47 (13%)

HR, heart rate; RR, respiratory rate; SD, standard deviation.
^a HR (or RR) Study Population is the subset of patients found to have
regularized HRs (or RRs) from each of the three confidence interval ranges.
^b Four patients had no assigned gender in the total population.
^c Nineteen patients had no assigned mechanism of injury.




Fig.  1.  (A)  Electrocardiogram  (ECG)  waveform  and  (B)  corresponding 
regularized  heart  rates  (HRr)  and  associated  confidence  intervals  (CIs; 
vertical  bars).  Noisy  waveform  segments  are  characterized  by  HRr  with  large 
CIs, whereas clean segments give rise to reliable HRr (CIs < 5 bpm).

Fig. 2. (A) Respiratory waveform and (B) corresponding regularized
respiratory  rates  (RRr)  and  associated  confidence  intervals  (CIs;  vertical 
bars).  Noisy  waveform  segments  are  characterized  by  RRr  with  large  CIs, 
whereas clean segments give rise to reliable RRr (CIs < 5 bpm).

TABLE II

DIAGNOSTIC ABILITY OF REGULARIZED HRs AND RRs WITH
DIFFERENT CIs TO DIAGNOSE MAJOR HEMORRHAGE

Vital Sign   Data Selection       ROC AUC (95% CI)
HRr          CI < 5 bpm           0.70 (0.62-0.77)
             5 < CI < 20 bpm      0.70 (0.62-0.77)
             CI > 20 bpm          0.66 (0.58-0.73)^a,b
HRp          All available data   0.70 (0.63-0.77)
HRc          QI > 2               0.72 (0.64-0.78)
RRr          CI < 5 bpm           0.66 (0.57-0.73)^a
             5 < CI < 20 bpm      0.62 (0.54-0.70)
             CI > 20 bpm          0.60 (0.52-0.67)^b
RRp          All available data   0.54 (0.45-0.63)^b
RRc          QI > 2               0.71 (0.61-0.79)^a

HRr / RRr, regularized heart rate / respiratory rate; HRp / RRp, Propaq heart
rate / respiratory rate; HRc / RRc, heart rate / respiratory rate from prior
reliability algorithms that assess morphology of the source waveforms [1],
[2]; CI, confidence interval; QI, quality index; ROC AUC, area under the
receiver operating characteristic curve.
^a Significantly different (p < 0.05 by DeLong’s test) from HRp / RRp.
^b Significantly different (p < 0.05 by DeLong’s test) from HRr / RRr when
CI < 5 bpm.

When compared to the Propaq vital signs, HRr yielded no
improvement, while RRr was significantly more diagnostic.
We believe this is because measurement errors in HRp were
unbiased, so they were largely filtered out by taking an
average value over several minutes. By contrast, RRp errors
tended to falsely elevate the rate (i.e., motion artifact was
counted falsely as a breath). As a result, the average RRp
yielded a significantly worse ROC AUC than that of RRr
when CI < 5 bpm.
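The reasoning about unbiased versus one-sided errors can be illustrated with a small simulation. All numbers here (true rate, noise level, number of readings, random seed) are arbitrary assumptions chosen for illustration and are not drawn from the study data.

```python
import numpy as np

rng = np.random.default_rng(0)
true_rate = 18.0   # hypothetical true rate (bpm)
n = 300            # e.g., 5 min of 1-s monitor readings

# Zero-mean errors: positive and negative deviations tend to cancel on averaging.
unbiased = true_rate + rng.normal(0.0, 5.0, n)

# One-sided errors: artifacts only ever add spurious counts, so averaging
# cannot remove them.
biased = true_rate + np.abs(rng.normal(0.0, 5.0, n))

err_unbiased = abs(unbiased.mean() - true_rate)
err_biased = abs(biased.mean() - true_rate)
```

Under these assumptions the averaged unbiased readings land close to the true rate, whereas the averaged one-sided readings remain elevated by several bpm regardless of how many readings are averaged.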

When compared to the vital signs from our previously
developed QI algorithms, HRr and RRr were slightly (though
not statistically significantly) worse. One reason is that the
prior algorithms apply a set of
heuristic rules involving the shape, timing, and frequency
characteristics of the source waveforms (ECG and respiratory
waveform) to determine when the measurements are reliable
[1], [2]. By contrast, reliability for HRr and RRr is based
entirely on the timing of the heartbeats or breaths, that is, the
difference between the OLS solution and the regularized rates,
where larger differences yield larger CIs [4]. Another reason
is that HRr and RRr are instantaneous rates, while HRc
and RRc are average rates over 7- and 15-s data segments,
respectively, which helps in further suppressing
high-frequency noise in the estimated rates and improving
estimation accuracy. Nevertheless, we believe that a slight
decrement in performance of the proposed algorithm relative to the
previously developed QI algorithms may be an acceptable
trade-off for certain applications, because the older QI
algorithms are based on heuristic criteria and require
redundant sensor measurements, while the proposed
algorithm is statistically based and requires no
information other than the original waveform.

As biosensors become ubiquitous in everyday life, it is
important that we continue to develop algorithms that can
improve our ability to automatically assess the reliability of
vital signs while simultaneously attempting to develop more
reliable sensors for physiological data collection. For both
civilian and military applications, it is particularly important
to infer reliable values of HRs and RRs collected from
austere, unstructured environments, such as a battlefield,
the transport of trauma patients, in-home care of
elderly patients, and the monitoring of active individuals
during physical activity, where the original physiological
waveforms are prone to contamination by noise artifacts.
The study presented here suggests that statistical CIs can be
used as a systematic, analytical approach to automatically
assess the reliability of field-collected HRs and RRs.

Disclaimer 

The  opinions  and  assertions  contained  herein  are  the 
private  views  of  the  authors  and  are  not  to  be  construed  as 
official  or  as  reflecting  the  views  of  the  U.S.  Army  or  of  the 
U.S.  Department  of  Defense.  This  paper  has  been  approved 
for public release with unlimited distribution.